Skip to content

CMake and linking fixes for PyPi builds#174

Merged
scal444 merged 2 commits into
NVIDIA-Digital-Bio:mainfrom
scal444:pip-wheels-minimal-container-fix
May 18, 2026
Merged

CMake and linking fixes for PyPi builds#174
scal444 merged 2 commits into
NVIDIA-Digital-Bio:mainfrom
scal444:pip-wheels-minimal-container-fix

Conversation

@scal444
Copy link
Copy Markdown
Collaborator

@scal444 scal444 commented May 15, 2026

The pip-build path was globbing every .so in rdkit.libs/ into RDKit_LIBS and every libboost_* into Boost_LIBRARIES, then linking each into every nvmolkit Python module. That dragged libcairo and libquadmath onto each module's NEEDED list. libcairo in turn NEEDEDs libXrender/libX11/libXext; those are on the manylinux lib_whitelist so rdkit-pypi's auditwheel pass legally left them external, but the nvidia/cuda runtime container we test in doesn't ship them, so import nvmolkit.fingerprints failed at load.

Narrow RDKit_LIBS to the 16 components nvmolkit actually uses (mirroring the conda path's explicit list) and Boost_LIBRARIES to the same 4 (boost serialization / iostreams / python / numpy). Also patchelf with --force-rpath so the entry-point modules get DT_RPATH instead of DT_RUNPATH. The libs inside rdkit.libs/ have no rpath of their own and rely on RPATH inheritance to resolve second-level deps; rdkit's own python bindings do the same. Drop the unused ${RDKit_LIBS} link from _arrayHelpers, which touches zero RDKit symbols.

The pip-build path was globbing every .so in rdkit.libs/ into RDKit_LIBS
and every libboost_* into Boost_LIBRARIES, then linking each into every
nvmolkit Python module. That dragged libcairo and libquadmath onto each
module's NEEDED list. libcairo in turn NEEDEDs libXrender/libX11/libXext;
those are on the manylinux lib_whitelist so rdkit-pypi's auditwheel pass
legally left them external, but the nvidia/cuda runtime container we test
in doesn't ship them, so `import nvmolkit.fingerprints` failed at load.

Narrow RDKit_LIBS to the 16 components nvmolkit actually uses (mirroring
the conda path's explicit list) and Boost_LIBRARIES to the same 4 (boost
serialization / iostreams / python<ver> / numpy<ver>). Also patchelf with
--force-rpath so the entry-point modules get DT_RPATH instead of
DT_RUNPATH. The libs inside rdkit.libs/ have no rpath of their own and
rely on RPATH inheritance to resolve second-level deps; rdkit's own
python bindings do the same. Drop the unused ${RDKit_LIBS} link from
_arrayHelpers, which touches zero RDKit symbols.

Verified the rdkit==2026.3.1 py3.12 wheel built with this change loads
and passes the full pytest suite (390 passed, 10 long deselected) on an
H200 in the manylinux+CUDA container with no system X/font libs.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 15, 2026

Greptile Summary

This PR fixes the PyPI build path to avoid dragging unneeded X11/cairo/quadmath shared libraries onto every nvmolkit Python module's NEEDED list, which caused import nvmolkit.fingerprints to fail in CUDA runtime containers that don't ship those X11 libs. It also switches patchelf from DT_RUNPATH to DT_RPATH so second-level transitive deps inside rdkit.libs/ can resolve their siblings at runtime.

  • cmake/rdkit.cmake and cmake/boost.cmake replace broad glob-everything-in-rdkit.libs/ logic with per-component targeted globs for the 16 RDKit and 4 Boost libraries nvmolkit actually uses, with FATAL_ERROR guards if any expected .so is absent or ambiguous.
  • admin/distribute/repair_wheel.sh adds --force-rpath to the patchelf call so the set RPATH is recorded as DT_RPATH (inherited by transitive loads) rather than DT_RUNPATH (not inherited), matching the approach used by rdkit's own Python bindings.
  • nvmolkit/CMakeLists.txt drops the spurious ${RDKit_LIBS} private link from _arrayHelpers, which references no RDKit symbols.

Confidence Score: 5/5

Changes are narrowly scoped to the pip-build CMake wiring and the post-auditwheel patchelf step; the conda/system build path is structurally unchanged.

All four changes are logically correct and internally consistent. The component-by-component glob approach with FATAL_ERROR guards is strictly safer than the previous unconstrained glob. The DT_RPATH switch is deliberate, well-documented, and matches rdkit's own practice. Removing the dead RDKit link from _arrayHelpers is a clean-up with no risk of breakage.

No files require special attention.

Important Files Changed

Filename Overview
admin/distribute/repair_wheel.sh Added --force-rpath to patchelf call so entry-point .so files carry DT_RPATH (inherited by transitive deps) instead of DT_RUNPATH (not inherited); comment expanded to explain the rationale.
cmake/boost.cmake BOOST_TARGET_LIBS now shared between both build paths; pip path globs exactly one .so per named component (serialization/iostreams/python/numpy) instead of filtering the entire rdkit.libs glob, with FATAL_ERROR guards for unexpected match counts.
cmake/rdkit.cmake Introduces NVMOLKIT_RDKIT_COMPONENTS list shared by both code paths; pip path now resolves exactly 16 named libRDKit*.so.* files instead of globbing every file in rdkit.libs/, with FATAL_ERROR guards for missing or ambiguous matches.
nvmolkit/CMakeLists.txt Removes the unused ${RDKit_LIBS} private link from _arrayHelpers, which touches no RDKit symbols; all other module link lines are unchanged.

Reviews (2): Last reviewed commit: "Merge branch 'main' into pip-wheels-mini..." | Re-trigger Greptile

@scal444 scal444 requested a review from evasnow1992 May 18, 2026 22:26
Copy link
Copy Markdown
Collaborator

@evasnow1992 evasnow1992 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me. Thanks!

@scal444 scal444 merged commit 20e5b28 into NVIDIA-Digital-Bio:main May 18, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants